Results 1 - 20 of 25
1.
Article in English | MEDLINE | ID: mdl-38324432

ABSTRACT

To automatically mine structured semantic topics from text, neural topic modeling has emerged and made some progress. However, most existing work focuses on mechanisms that enhance topic coherence at the cost of the diversity of the extracted topics. To address this limitation, we propose the first neural topic modeling approach based purely on mutual information maximization, called the mutual information topic (MIT) model, in this article. The proposed MIT significantly improves topic diversity by maximizing the mutual information between the word distribution and the topic distribution. Meanwhile, MIT also uses a Dirichlet prior in the latent topic space to ensure the quality of the mined topics. The experimental results on three public benchmark text corpora show that MIT extracts topics with higher coherence values (considering four topic coherence metrics) than competitive approaches and significantly improves the topic diversity metric. In addition, our experiments show that the proposed MIT converges faster and more stably than adversarial-neural topic models.
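
The abstract above names the mutual information between the word and topic distributions as the quantity being maximized, but it gives no formula. As a rough illustration only (this is not the authors' neural estimator), the sketch below computes the exact mutual information I(W;T) = sum over (w,t) of p(w,t) log[p(w,t) / (p(w) p(t))] for a small discrete joint table. The function name and the toy tables are assumptions introduced here.

```python
import math

def mutual_information(joint):
    """Mutual information (in nats) of a discrete joint distribution.

    joint[i][j] is p(word=i, topic=j); the entries are assumed to sum to 1.
    """
    p_word = [sum(row) for row in joint]          # marginal over topics
    p_topic = [sum(col) for col in zip(*joint)]   # marginal over words
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:  # zero cells contribute nothing to the sum
                mi += p * math.log(p / (p_word[i] * p_topic[j]))
    return mi

# Hypothetical toy tables with two words and two topics.
independent = [[0.25, 0.25], [0.25, 0.25]]  # words carry no topic information
dependent = [[0.5, 0.0], [0.0, 0.5]]        # each word determines its topic

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

In the independent table the quantity is zero, and in the fully dependent table it reaches log 2, which is the sense in which a higher value means words are more informative about topics.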

2.
Environ Sci Pollut Res Int ; 30(59): 123862-123881, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37995031

ABSTRACT

As a bridge between economy and ecology, green finance is vital in improving environmental quality and promoting sustainable development. Based on an environmental pollution index system, this paper constructs the [Formula: see text] model to explore the specific impact of green finance on environmental pollution using China's provincial panel data from 2007 to 2020. It also constructs an intermediary model to test the mechanism by which green finance reduces environmental pollution and discusses the regional heterogeneity of this effect. The results show that (1) green finance can significantly reduce environmental pollution: green credit has a pronounced effect, green investment has a relatively small effect, and green securities have no significant effect. (2) Green finance has the strongest inhibitory effect on solid pollution, a weaker inhibitory effect on air pollution, and no significant effect on water pollution. (3) Green technology innovation, industrial structure upgrading, and environmental regulation play an intermediary role in the process by which green finance reduces environmental pollution and improves environmental quality. (4) The effect of green finance in the eastern region and in carbon emission pilot areas is significantly better than in the central and western regions and in non-carbon emission pilot areas, respectively. Based on these results, suggestions are put forward to promote the development of green finance, which is of great significance for reducing environmental pollution and achieving sustainable development goals.


Subjects
Air Pollution , Air Pollution/prevention & control , Water Pollution , Carbon , Ecology , China , Economic Development
3.
Front Psychol ; 14: 1026638, 2023.
Article in English | MEDLINE | ID: mdl-36844331

ABSTRACT

This paper explores how older adults with different cognitive abilities perform the refusal speech act in the cognitive assessment in the setting of memory clinics. The refusal speech act and its corresponding illocutionary force produced by nine Chinese older adults in the Montreal Cognitive Assessment-Basic was annotated and analyzed from a multimodal perspective. Overall, regardless of the older adults' cognitive ability, the most common discursive device to refuse is the demonstration of their inability to carry out or continue the cognitive task. Individuals with lower cognitive ability were found to perform the refusal illocutionary force (hereafter RIF) with higher frequency and degree. Additionally, under the pragmatic compensation mechanism, which is influenced by cognitive ability, multiple expression devices (including prosodic features and non-verbal acts) interact dynamically and synergistically to help older adults carry out the refusal behavior and to unfold older adults' intentional state and emotion as well. The findings indicate that both the degree and the frequency of performing the refusal speech act in the cognitive assessment are related to the cognitive ability of older adults.

4.
Article in English | MEDLINE | ID: mdl-35576420

ABSTRACT

Biomedical argument mining aims to automatically identify and extract the argumentative structure in biomedical text. It helps to determine not only what positions people adopt, but also why they hold such opinions, which provides valuable insights into medical decision making. Generally, biomedical argument mining consists of three subtasks: argument component identification, argument component classification and relation identification. Current approaches employ conventional multi-task learning framework for jointly addressing the latter two subtasks, and achieve some success. However, explicit sequential dependency between these two subtasks is ignored, which is crucial for accurate biomedical argument mining. Moreover, relation identification is conducted solely based on the argument component pair without considering its potentially valuable context. Therefore, in this paper, a novel sequential multi-task learning approach is proposed for biomedical argument mining. Specifically, to model explicit sequential dependency between argument component classification and relation identification, an information transfer strategy is employed to capture the information of argument component type that is transferred to relation identification. Furthermore, graph convolutional network is employed to model dependency relation among the related argument component pairs. The proposed method has been evaluated on a benchmark dataset and the experimental results show that the proposed method outperforms the state-of-the-art methods.


Subjects
Benchmarking , Clinical Decision-Making , Humans
5.
PLoS One ; 17(10): e0275998, 2022.
Article in English | MEDLINE | ID: mdl-36301794

ABSTRACT

The steam turbine is one of the major pieces of equipment in thermal power plants, and it is crucial to predict its output accurately. However, because of its complex coupling relationships with other equipment, this remains a challenging task. Previous methods mainly focus on the operation of the steam turbine in isolation while ignoring its coupling relationship with the condenser, which we believe is crucial for the prediction. Therefore, in this paper, to explore the coupling relationship between the steam turbine and the condenser, we propose a novel approach for steam turbine power prediction based on an encoder-decoder framework guided by the condenser vacuum degree (CVD-EDF). Specifically, the historical information within the condenser operating condition data is encoded using a long short-term memory (LSTM) network. Moreover, a connection module consisting of an attention mechanism and a convolutional neural network is incorporated to capture the local and global information in the encoder. The steam turbine power is then predicted based on all of this information. In this way, the coupling relationship between the condenser and the steam turbine is fully explored. Extensive experiments are conducted on real data from a power plant. The experimental results show that the proposed CVD-EDF achieves large improvements over several competitive methods. Compared with the LSTM at one-minute intervals, our method improves RMSE and MAE by 32.2% and 37.0%, respectively.


Subjects
Cardiovascular Diseases , Steam , Humans , Vacuum , Power Plants , Neural Networks, Computer
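
The steam turbine abstract above reports its gains as percentage improvements in RMSE and MAE over an LSTM baseline. The abstract does not state how the percentage is computed; the sketch below assumes the common reading, a relative reduction in error, and uses made-up power readings rather than any data from the paper. All function names are introduced here for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def relative_improvement(baseline_error, model_error):
    """Fraction by which the model lowers the baseline error (assumed definition)."""
    return (baseline_error - model_error) / baseline_error

# Hypothetical power readings and predictions, for illustration only.
actual = [100.0, 102.0, 98.0, 101.0]
baseline_pred = [104.0, 98.0, 102.0, 97.0]   # every prediction is off by 4
model_pred = [102.0, 100.0, 100.0, 99.0]     # every prediction is off by 2

rmse_gain = relative_improvement(rmse(actual, baseline_pred), rmse(actual, model_pred))
mae_gain = relative_improvement(mae(actual, baseline_pred), mae(actual, model_pred))
```

With these toy numbers both errors halve, so both gains come out at 0.5; on real data the two metrics usually improve by different amounts because RMSE weights large errors more heavily.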
6.
IEEE/ACM Trans Comput Biol Bioinform ; 19(4): 2365-2376, 2022.
Article in English | MEDLINE | ID: mdl-33974546

ABSTRACT

Biomedical factoid question answering is an important task in biomedical question answering applications. It has attracted much attention because of its reliability. In question answering systems, better representation of words is of great importance, and proper word embedding can significantly improve the performance of the system. With the success of pretrained models in general natural language processing tasks, pretrained models have been widely used in biomedical areas, and many pretrained model-based approaches have been proven effective in biomedical question answering tasks. In addition to proper word embedding, named entities also provide important information for biomedical question answering. Inspired by the concept of transfer learning, in this study, we developed a mechanism to fine-tune BioBERT with a named entity dataset to improve the question answering performance. Furthermore, we applied BiLSTM to encode the question text to obtain sentence-level information. To better combine the question-level and token-level information, we use bagging to further improve the overall performance. The proposed framework was evaluated on the BioASQ 6b and 7b datasets, and the results show that it outperforms all baselines.


Subjects
Machine Learning , Natural Language Processing , Language , Learning , Reproducibility of Results
7.
Artif Intell Med ; 118: 102119, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34412842

ABSTRACT

OBJECTIVE: Health issue identification in social media is to predict whether the writers have a disease based on their posts. Numerous posts and comments are shared on social media by users. Certain posts may reflect writers' health condition, which can be employed for health issue identification. Usually, the health issue identification problem is formulated as a classification task. METHODS AND MATERIAL: In this paper, we propose novel multi-task hierarchical neural networks with topic attention for identifying health issue based on posts collected from the social media platforms. Specifically, the model incorporates the hierarchical relationship among the document, sentences, and words via bidirectional gated recurrent units (BiGRUs). The global topic information shared across posts is incorporated with the hidden states of BiGRUs to obtain the topic-enhanced attention weights for words. In addition, tasks of predicting whether the writers suffer from a disease (health issue identification) and predicting the specific domain of the posts (domain category classification) are learned jointly in multi-task mechanism. RESULTS: The proposed method is evaluated on two datasets: dementia issue dataset and depression issue dataset. The proposed approach achieves 98.03% and 88.28% F-1 score on two datasets, outperforming the state-of-the-art approach by 0.73% and 0.4% respectively. Further experimental analysis shows the effectiveness of incorporating both the multi-task learning framework and topic attention mechanism.


Subjects
Social Media , Humans , Language , Neural Networks, Computer
8.
J Affect Disord ; 295: 148-155, 2021 Dec 01.
Article in English | MEDLINE | ID: mdl-34461370

ABSTRACT

BACKGROUND: Objective biomarkers are crucial for overcoming the clinical dilemma in major depressive disorder (MDD), and individualized diagnosis is essential to facilitate precision medicine for MDD. METHODS: Sleep disturbance-related magnetic resonance imaging (MRI) features were identified in the internal dataset (92 MDD patients) using the relevance vector regression algorithm and were further verified in 460 MDD patients from an independent, multicenter dataset. Subsequently, using these MRI features, an eXtreme Gradient Boosting classification model was constructed in the multicenter dataset (460 MDD patients and 470 normal controls). The association between classification outputs and the severity of depressive symptoms was also investigated. RESULTS: In MDD patients, the combination of gray matter density and fractional amplitude of low-frequency fluctuation can predict the individual sleep disturbance score, calculated as the sum of the item 4, item 5, and item 6 scores of the 17-Item Hamilton Rating Scale for Depression (HAMD-17) (R² = 0.158 in the internal dataset; R² = 0.110 in the multicenter dataset). Furthermore, the classification model based on these MRI features distinguished MDD patients from normal controls with 86.3% accuracy (area under the curve = 0.937). Importantly, the classification outputs significantly correlated with HAMD-17 scores in MDD patients. LIMITATION: The study lacked specialized tools to assess personal sleep quality, e.g., the Pittsburgh Sleep Quality Index. CONCLUSION: Neuroimaging features can accurately reflect individual sleep disturbance manifestation and serve as potential diagnostic biomarkers of MDD.


Subjects
Depressive Disorder, Major , Biomarkers , Depressive Disorder, Major/diagnostic imaging , Humans , Machine Learning , Neuroimaging , Sleep
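
The sleep disturbance score used as the regression target in the abstract above is described as the sum of items 4, 5, and 6 of the HAMD-17. A minimal sketch of that arithmetic follows; the dictionary layout, the function name, and the example ratings are assumptions made here, not part of the study.

```python
# Item numbers taken from the abstract: HAMD-17 items 4, 5 and 6.
SLEEP_ITEMS = (4, 5, 6)

def sleep_disturbance_score(hamd17_items):
    """Sum the three sleep-related item scores.

    hamd17_items maps an item number (1..17) to that item's rating.
    A KeyError is raised if any of the three sleep items is missing.
    """
    return sum(hamd17_items[item] for item in SLEEP_ITEMS)

# Hypothetical ratings for one patient; all other items are left at 0.
patient = {item: 0 for item in range(1, 18)}
patient.update({1: 3, 4: 2, 5: 1, 6: 1})

score = sleep_disturbance_score(patient)
```

Only the three sleep items enter the score, so the rating on item 1 in this example does not affect it.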
9.
ACS Chem Neurosci ; 12(15): 2878-2886, 2021 Aug 04.
Article in English | MEDLINE | ID: mdl-34282889

ABSTRACT

Diagnosis of major depressive disorder (MDD) using resting-state functional connectivity (rs-FC) data faces many challenges, such as high dimensionality, small samples, and individual differences. To assess the clinical value of rs-FC in MDD and identify a potential rs-FC machine learning (ML) model for the individualized diagnosis of MDD, a progressive three-step ML analysis was performed on the rs-FC data, including six different ML algorithms and two dimension reduction methods, to investigate the classification performance of the ML models in a large, multicenter dataset [1021 MDD patients and 1100 normal controls (NCs)]. Furthermore, a linear least-squares regression model was used to assess the relationships between rs-FC features and the severity of clinical symptoms in MDD patients. Among the ML methods used, the rs-FC model constructed by the eXtreme Gradient Boosting (XGBoost) method showed the best classification performance for distinguishing MDD patients from NCs at the individual level (accuracy = 0.728, sensitivity = 0.720, specificity = 0.739, area under the curve = 0.831). Meanwhile, the rs-FCs identified by the XGBoost model were primarily distributed within and between the default mode network, limbic network, and visual network. More importantly, the individual 17-item Hamilton Depression Scale scores of MDD patients can be predicted using rs-FC features identified by the XGBoost model (adjusted R² = 0.180, root mean squared error = 0.946). The XGBoost model using rs-FCs showed the best classification performance between MDD patients and NCs, with good generalization and neuroscientific interpretability.


Subjects
Depressive Disorder, Major , Brain/diagnostic imaging , Brain Mapping , Humans , Machine Learning , Magnetic Resonance Imaging
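
The abstract above summarizes classifier quality with accuracy, sensitivity, and specificity. As a reminder of how those three figures relate to a confusion matrix, here is a small sketch; the counts are invented for illustration and are not the study's confusion matrix, and the function name is an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn count patients classified correctly/incorrectly;
    tn/fp count controls classified correctly/incorrectly.
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # share of patients detected
        "specificity": tn / (tn + fp),   # share of controls correctly rejected
    }

# Hypothetical counts: 100 patients and 100 controls.
metrics = classification_metrics(tp=72, fn=28, tn=74, fp=26)
```

When the two groups are the same size, as in this toy case, accuracy is simply the average of sensitivity and specificity; with unequal groups it is weighted toward the larger one.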
10.
Ann Transl Med ; 9(4): 316, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33708943

ABSTRACT

BACKGROUND: Diabetes has significant effects on bone metabolism. Both type 1 and type 2 diabetes can cause osteoporotic fracture. However, it remains challenging to diagnose osteoporosis in type 2 diabetes by bone mineral density, which lacks regular changes. Seen another way, osteoporosis can be ascribed to an imbalance of bone metabolism, which is closely related to diabetes as well. METHODS: Here, to assist clinicians in diagnosing osteoporosis in type 2 diabetes, an efficient and simple SVM (support vector machine) model was established based on different combinations of biochemical indexes, which were collected from patients who underwent bone turnover marker (BTM) testing from January 2016 to March 2018 in the Department of Endocrinology, Zhongda Hospital affiliated to Southeast University. The classification was performed with a machine learning software package in Python. The classification performance was measured by the SKLearn program incorporated in the Python software package and compared with the clinical diagnostic results. RESULTS: The prediction accuracy of the final model was above 88%, with a feature combination of sex, age, BMI (body mass index), TP1NP (total procollagen I N-terminal propeptide) and OSTEOC (osteocalcin). CONCLUSIONS: Experimental results show that the model achieved the anticipated results for early detection and daily monitoring of type 2 diabetic osteoporosis.

11.
IEEE/ACM Trans Comput Biol Bioinform ; 17(6): 2029-2039, 2020.
Article in English | MEDLINE | ID: mdl-31095491

ABSTRACT

Biomedical event extraction plays an important role in the extraction of biological information from large-scale scientific publications. However, most state-of-the-art systems separate this task into several steps, which leads to cascading errors. In addition, it is complicated to generate features from syntactic and dependency analysis separately. Therefore, in this paper, we propose an end-to-end model based on long short-term memory (LSTM) to optimize biomedical event extraction. Experimental results demonstrate that our approach improves the performance of biomedical event extraction. We achieve average F1-scores of 59.68, 58.23, and 57.39 percent on the BioNLP09, BioNLP11, and BioNLP13's Genia event datasets, respectively. The experimental study has shown our proposed model's potential in biomedical event extraction.


Subjects
Biomedical Research/classification , Computational Biology/methods , Data Mining/methods , Neural Networks, Computer
12.
ACS Chem Neurosci ; 10(8): 3479-3485, 2019 Aug 21.
Article in English | MEDLINE | ID: mdl-31145586

ABSTRACT

The objective of the study was to explore the potential value of plasma indicators for identifying amnestic mild cognitive impairment (aMCI) and determine whether levels of plasma indicators are related to cognitive performance and brain tissue volumes. In total, 155 participants (68 aMCI patients and 87 healthy controls) were recruited in the present cross-sectional study. The levels of plasma amyloid-β (Aβ) 40, Aβ42, total tau (t-tau), and neurofilament light (NFL) were measured using an ultrasensitive quantitative method. Machine learning algorithms were used to establish an optimal model for identifying aMCI. Compared with healthy controls, Aβ40 and Aβ42 levels were lower and NFL levels were higher in the plasma of aMCI patients, with the exception of t-tau levels. In aMCI patients, higher plasma Aβ40 levels were correlated with impaired episodic memory, and negative correlations were observed between plasma t-tau levels and both global cognitive function and gray matter (GM) volume. In addition, higher plasma NFL levels were correlated with reduced hippocampus volume and total GM volume of the left inferior and middle temporal gyrus. An integrated model that included clinical features, hippocampus volume, and plasma Aβ42 and NFL had the highest accuracy for detecting aMCI patients (accuracy, 74.2%). We demonstrated that plasma Aβ40, Aβ42, t-tau, and NFL may be useful to identify aMCI and correlate with cognitive decline and brain atrophy. Among these plasma indicators, Aβ42 and NFL are more valuable as key members of a peripheral biomarker panel to detect aMCI.


Subjects
Amyloid beta-Peptides/blood , Cognitive Dysfunction/blood , Cognitive Dysfunction/diagnosis , Early Diagnosis , Neurofilament Proteins/blood , tau Proteins/blood , Aged , Alzheimer Disease/blood , Alzheimer Disease/diagnosis , Alzheimer Disease/pathology , Biomarkers/blood , Brain/pathology , Cognitive Dysfunction/pathology , Cross-Sectional Studies , Female , Humans , Male , Middle Aged
13.
Virus Genes ; 54(5): 662-671, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30105631

ABSTRACT

Despite the notable success of combination antiretroviral therapy, how to eradicate latent HIV-1 from reservoirs poses a challenge. The Tat protein plays an indispensable role in HIV reactivation and histone demethylase LSD1 promotes Tat-mediated long terminal repeats (LTR) activation. However, the role of LSD1 in remodeling chromatin and the role of its component BHC80 in activation of latent HIV-1 in T cells are unknown. Our findings indicate that LSD1 could decrease the level of histone H3 lysine 4 trimethylation (H3K4me3) at the HIV-1 promoter by recruiting histone lysine demethylase 5A (KDM5A) and preventing histone methyltransferase Set1A and WD-40 repeat protein 5 (WDR5) from binding to LTR. Moreover, BHC80 is necessary for LSD1-triggered LTR activation and assists LSD1 in activating LTR by binding to nucleotides 305-631 of LTR. In activated J-Lat-A2 cells, BHC80 expression was elevated and its isoform BHC80-6 promoted the association of BHC80 with LSD1. These results suggest that the LSD1-BHC80 complex enhances HIV-1 transcription by a decrease of H3K4me3 level at the viral promoter. Therefore, it might be used as a new drug target to reactivate latent HIV-1.


Subjects
HIV-1/metabolism , Histone Deacetylases/metabolism , Histone Demethylases/metabolism , tat Gene Products, Human Immunodeficiency Virus/metabolism , Binding Sites , HEK293 Cells , HIV-1/genetics , HeLa Cells , Humans , Jurkat Cells , Promoter Regions, Genetic , Protein Binding , Sp1 Transcription Factor/metabolism , Terminal Repeat Sequences , Transcriptional Activation , Virus Activation
14.
Virol Sin ; 33(3): 261-269, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29737506

ABSTRACT

Despite the success of combined antiretroviral therapy in recent years, the prevalence of human immunodeficiency virus (HIV)-associated neurocognitive disorders in people living with HIV-1 is increasing, significantly reducing the health-related quality of their lives. Although neurons cannot be infected by HIV-1, shed viral proteins such as transactivator of transcription (Tat) can cause dendritic damage. However, the detailed molecular mechanism of Tat-induced neuronal impairment remains unknown. In this study, we first showed that recombinant Tat (1-72 aa) induced neurotoxicity in primary cultured mouse neurons. Second, exposure to Tat1-72 was shown to reduce the length and number of dendrites in cultured neurons. Third, Tat1-72 (0-6 h) modulates protein phosphatase 1 (PP1) expression and enhances its activity by decreasing the phosphorylation level of PP1 at Thr320. Finally, Tat1-72 (24 h) downregulates CREB activity and CREB-mediated gene (BDNF, c-fos, Egr-1) expression. Together, these findings suggest that Tat1-72 might impair cognitive function by regulating the activity of PP1 and the CREB/BDNF pathway.


Subjects
Brain-Derived Neurotrophic Factor/metabolism , Cyclic AMP Response Element-Binding Protein/metabolism , Dendrites/metabolism , HIV-1/metabolism , Neurons/metabolism , Protein Phosphatase 1/metabolism , tat Gene Products, Human Immunodeficiency Virus/pharmacology , Animals , Blotting, Western , Cells, Cultured , Dendrites/drug effects , Mice , Mice, Inbred C57BL , Neurons/drug effects , Signal Transduction/physiology
15.
Artif Intell Med ; 87: 1-8, 2018 May.
Article in English | MEDLINE | ID: mdl-29559249

ABSTRACT

OBJECTIVE: A drug-drug interaction (DDI) is a situation in which a drug affects the activity of another drug synergistically or antagonistically when being administered together. The information of DDIs is crucial for healthcare professionals to prevent adverse drug events. Although some known DDIs can be found in purposely-built databases such as DrugBank, most information is still buried in scientific publications. Therefore, automatically extracting DDIs from biomedical texts is sorely needed. METHODS AND MATERIAL: In this paper, we propose a novel position-aware deep multi-task learning approach for extracting DDIs from biomedical texts. In particular, sentences are represented as a sequence of word embeddings and position embeddings. An attention-based bidirectional long short-term memory (BiLSTM) network is used to encode each sentence. The relative position information of words with the target drugs in text is combined with the hidden states of BiLSTM to generate the position-aware attention weights. Moreover, the tasks of predicting whether or not two drugs interact with each other and further distinguishing the types of interactions are learned jointly in multi-task learning framework. RESULTS: The proposed approach has been evaluated on the DDIExtraction challenge 2013 corpus and the results show that with the position-aware attention only, our proposed approach outperforms the state-of-the-art method by 0.99% for binary DDI classification, and with both position-aware attention and multi-task learning, our approach achieves a micro F-score of 72.99% on interaction type identification, outperforming the state-of-the-art approach by 1.51%, which demonstrates the effectiveness of the proposed approach.


Subjects
Data Mining , Deep Learning , Drug Interactions , Databases, Factual , Drug-Related Side Effects and Adverse Reactions
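
The drug-drug interaction abstract above reports a micro F-score for interaction type identification. Micro-averaging pools the true positives, false positives, and false negatives of all classes before computing precision and recall. The sketch below shows that pooling; the per-type counts are invented, and the type labels are used only as example keys.

```python
def micro_f_score(counts):
    """Micro-averaged F-score from per-class tp/fp/fn counts.

    counts maps a class label to a dict with keys "tp", "fp" and "fn".
    Returns 0.0 when there are no true positives at all.
    """
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical per-type counts, for illustration only.
counts = {
    "mechanism": {"tp": 30, "fp": 10, "fn": 10},
    "effect": {"tp": 40, "fp": 10, "fn": 20},
    "advice": {"tp": 20, "fp": 5, "fn": 5},
    "int": {"tp": 10, "fp": 5, "fn": 5},
}

micro_f = micro_f_score(counts)
```

Because the counts are pooled first, frequent interaction types dominate the micro score, whereas a macro average would weight every type equally.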
16.
Virol Sin ; 31(3): 199-206, 2016 Jun.
Article in English | MEDLINE | ID: mdl-27007880

ABSTRACT

The multifunctional trans-activator Tat is an essential regulatory protein for HIV-1 replication and is characterized by high sequence diversity. Numerous experimental studies have examined Tat in HIV-1 subtype B, but research on subtype C Tat is lacking, despite the high prevalence of infections caused by subtype C worldwide. We hypothesized that amino acid differences contribute to functional differences among Tat proteins. In the present study, we found that subtype B NL4-3 Tat and subtype C isolate HIV1084i Tat exhibited differences in stability by overexpressing the fusion protein Tat-Flag. In addition, 1084i Tat can activate LTR and NF-κB more efficiently than NL4-3 Tat. In analyses of the activities of the truncated forms of Tat, we found that the carboxyl-terminal region of Tat regulates its stability and transactivity. According to our results, we speculated that the differences in stability between B-Tat and C-Tat result in differences in transactivation ability.


Subjects
HIV-1/metabolism , tat Gene Products, Human Immunodeficiency Virus/chemistry , tat Gene Products, Human Immunodeficiency Virus/metabolism , Amino Acid Substitution , HEK293 Cells , HIV-1/chemistry , HIV-1/genetics , Humans , NF-kappa B/metabolism , Protein Stability , Recombinant Fusion Proteins/chemistry , Recombinant Fusion Proteins/genetics , Sequence Deletion , Structure-Activity Relationship , Transcriptional Activation , Virus Replication , tat Gene Products, Human Immunodeficiency Virus/genetics , tat Gene Products, Human Immunodeficiency Virus/immunology
17.
Virus Genes ; 52(2): 179-88, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26832332

ABSTRACT

The multifunctional transactivator Tat protein is an essential regulatory protein for HIV-1 replication, and it plays a role in the pathogenesis of HIV-1 infection. At present, numerous experimental studies of HIV-1 Tat focus on subtype B, whereas very few have examined subtype C Tat. In view of the amino acid variation among clade-specific Tat proteins, we hypothesized that amino acid differences contribute to functional differences among Tat proteins. In the present study, we documented that subtype B NL4-3 Tat and subtype C isolate HIV1084i Tat, from a pediatric patient in Zambia, exhibited distinct nuclear localization when the fusion protein Tat-EGFP was overexpressed. Interestingly, 1084i Tat showed uniform nuclear distribution, whereas NL4-3 Tat primarily localized in the nucleolus. The 57th amino acid, highly conserved in B-Tat (arginine) and in C-Tat (serine), is located in the basic domain of Tat and played an important role in this subcellular localization. Meanwhile, we found that substitution of arginine with serine at site 57 decreases Tat transactivation of the HIV-1 LTR promoter.


Subjects
Amino Acid Substitution , Genotype , HIV-1/genetics , HIV-1/metabolism , tat Gene Products, Human Immunodeficiency Virus/metabolism , Active Transport, Cell Nucleus , Amino Acid Sequence , HIV Infections/virology , Humans , Intracellular Space , Mutation , Nuclear Localization Signals , Nucleic Acid Conformation , Position-Specific Scoring Matrices , Protein Transport , Recombinant Fusion Proteins , tat Gene Products, Human Immunodeficiency Virus/chemistry , tat Gene Products, Human Immunodeficiency Virus/genetics
18.
Artif Intell Med ; 64(1): 51-8, 2015 May.
Article in English | MEDLINE | ID: mdl-25863986

ABSTRACT

OBJECTIVES: Scientists have devoted decades of effort to understanding the interaction between proteins or RNA production. This information might strengthen current knowledge of drug reactions or the development of certain diseases. Nevertheless, due to its lack of explicit structure, the life science literature, one of the most important sources of this information, is difficult for computer-based systems to access. Therefore, biomedical event extraction, which automatically acquires knowledge of molecular events from research articles, has recently attracted community-wide efforts. Most approaches are based on statistical models, which require large-scale annotated corpora to precisely estimate the models' parameters. However, such corpora are usually difficult to obtain in practice. Therefore, employing un-annotated data through semi-supervised learning for biomedical event extraction is a feasible solution and is attracting growing interest. METHODS AND MATERIAL: In this paper, a semi-supervised learning framework based on hidden topics for biomedical event extraction is presented. In this framework, sentences in the un-annotated corpus are automatically assigned event annotations based on their distances to the sentences in the annotated corpus. More specifically, not only the structures of the sentences, but also the hidden topics embedded in the sentences are used for describing the distance. The sentences and newly assigned event annotations, together with the annotated corpus, are employed for training. RESULTS: Experiments were conducted on the multi-level event extraction corpus, a gold standard corpus. Experimental results show that the proposed framework achieves more than a 2.2% improvement in F-score on biomedical event extraction compared to the state-of-the-art approach. CONCLUSION: The results suggest that by incorporating un-annotated data, the proposed framework indeed improves the performance of the state-of-the-art event extraction system and that the similarity between sentences might be precisely described by hidden topics and structures of the sentences.


Subjects
Data Mining/methods , Medical Informatics/methods , Supervised Machine Learning
19.
Comput Math Methods Med ; 2014: 298473, 2014.
Article in English | MEDLINE | ID: mdl-25214883

ABSTRACT

Biomedical relation extraction aims to uncover high-quality relations from life science literature with high accuracy and efficiency. Early biomedical relation extraction tasks focused on capturing binary relations, such as protein-protein interactions, which are crucial for virtually every process in a living cell. Information about these interactions provides the foundations for new therapeutic approaches. In recent years, more interests have been shifted to the extraction of complex relations such as biomolecular events. While complex relations go beyond binary relations and involve more than two arguments, they might also take another relation as an argument. In the paper, we conduct a thorough survey on the research in biomedical relation extraction. We first present a general framework for biomedical relation extraction and then discuss the approaches proposed for binary and complex relation extraction with focus on the latter since it is a much more difficult task compared to binary relation extraction. Finally, we discuss challenges that we are facing with complex relation extraction and outline possible solutions and future directions.


Subjects
Biology/methods , Data Mining/methods , Medicine/methods , Humans
20.
ScientificWorldJournal ; 2014: 121650, 2014.
Article in English | MEDLINE | ID: mdl-25152899

ABSTRACT

Natural language understanding aims to specify a computational model that maps sentences to their semantic meaning representations. In this paper, we propose a novel framework to train statistical models without using expensive fully annotated data. In particular, the input of our framework is a set of sentences labeled with abstract semantic annotations. These annotations encode the underlying embedded semantic structural relations without explicit word/semantic tag alignment. The proposed framework can automatically induce derivation rules that map sentences to their semantic meaning representations. The learning framework is applied to two statistical models, conditional random fields (CRFs) and hidden Markov support vector machines (HM-SVMs). Our experimental results on the DARPA Communicator data show that both CRFs and HM-SVMs outperform the baseline approach, the previously proposed hidden vector state (HVS) model, which is also trained on abstract semantic annotations. In addition, the proposed framework performs better than two other baseline approaches, a hybrid framework combining HVS and HM-SVMs and discriminative training of HVS, with relative error reductions of about 25% and 15% in F-measure, respectively.


Subjects
Models, Statistical , Natural Language Processing , Algorithms
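
The last abstract above reports relative error reduction in F-measure without defining it. One common convention, assumed here and not confirmed by the abstract, treats 1 - F as the error and reports the fraction of the baseline error that the new system removes. The F-measure values below are invented for illustration.

```python
def relative_error_reduction(f_baseline, f_new):
    """Relative error reduction, taking error = 1 - F-measure (assumed convention)."""
    baseline_error = 1.0 - f_baseline
    new_error = 1.0 - f_new
    return (baseline_error - new_error) / baseline_error

# Hypothetical F-measures: the baseline scores 0.80, the new system 0.85.
reduction = relative_error_reduction(0.80, 0.85)
```

Under this convention a five-point absolute gain from 0.80 to 0.85 corresponds to a 25% relative error reduction, which shows why relative figures look larger than the underlying change in F-measure.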